Scalable Kernel K-Means Clustering with Nystrom Approximation: Relative-Error Bounds

Authors

  • Shusen Wang
  • Alex Gittens
  • Michael W. Mahoney
Abstract

Kernel k-means clustering can correctly identify and extract a far more varied collection of cluster structures than the linear k-means clustering algorithm. However, kernel k-means clustering is computationally expensive when the non-linear feature map is high-dimensional and there are many input points. Kernel approximation, e.g., the Nyström method, has been applied in previous works to approximately solve kernel learning problems when both of the above conditions are present. This work analyzes the application of this paradigm to kernel k-means clustering, and shows that applying the linear k-means clustering algorithm to (k/ε)(1 + o(1)) features constructed using a so-called rank-restricted Nyström approximation results in cluster assignments that satisfy a 1 + ε approximation ratio in terms of the kernel k-means cost function, relative to the guarantee provided by the same algorithm without the use of the Nyström method. As part of the analysis, this work establishes a novel 1 + ε relative-error trace norm guarantee for low-rank approximation using the rank-restricted Nyström approximation. Empirical evaluations on the 8.1 million instance MNIST8M dataset demonstrate the scalability and usefulness of kernel k-means clustering with Nyström approximation. This work argues that spectral clustering using Nyström approximation (a popular and computationally efficient, but theoretically unsound, approach to non-linear clustering) should be replaced with the efficient and theoretically sound combination of kernel k-means clustering with Nyström approximation. The superior performance of the latter approach is empirically verified.
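
The pipeline described in the abstract (build a small number of features from a rank-restricted Nyström approximation, then run linear k-means on them) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the RBF kernel, uniform landmark sampling, the plain Lloyd iteration, the toy data, and all names (`rbf_kernel`, `nystrom_features`, `lloyd_kmeans`) and parameter values are hypothetical choices made for the example.

```python
import numpy as np

def rbf_kernel(X, Y, gamma):
    # Pairwise squared Euclidean distances, then exp(-gamma * d^2).
    d2 = (X ** 2).sum(1)[:, None] + (Y ** 2).sum(1)[None, :] - 2.0 * X @ Y.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def nystrom_features(X, n_landmarks, rank, gamma, rng):
    """Return B (n x at most `rank`) such that B @ B.T is the best
    rank-`rank` approximation of the Nystrom matrix C W^+ C^T."""
    # Assumption: landmarks are sampled uniformly without replacement.
    idx = rng.choice(X.shape[0], size=n_landmarks, replace=False)
    C = rbf_kernel(X, X[idx], gamma)      # n x c block of kernel columns
    W = C[idx]                            # c x c landmark kernel matrix
    # Square root of the pseudo-inverse of W via an eigendecomposition,
    # dropping numerically zero eigenvalues.
    evals, evecs = np.linalg.eigh((W + W.T) / 2.0)
    keep = evals > 1e-10 * evals.max()
    W_pinv_sqrt = evecs[:, keep] / np.sqrt(evals[keep])
    M = C @ W_pinv_sqrt                   # M @ M.T == C W^+ C^T
    # Rank restriction: keep only the top singular directions of M.
    U, S, _ = np.linalg.svd(M, full_matrices=False)
    return U[:, :rank] * S[:rank]

def lloyd_kmeans(B, k, rng, n_iter=50):
    # Plain Lloyd iterations on the rows of B (linear k-means).
    centers = B[rng.choice(B.shape[0], size=k, replace=False)]
    labels = np.zeros(B.shape[0], dtype=int)
    for _ in range(n_iter):
        d2 = ((B[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(1)
        for j in range(k):
            members = labels == j
            if members.any():             # leave empty clusters unchanged
                centers[j] = B[members].mean(0)
    return labels

# Toy usage: two Gaussian blobs, k = 2 clusters, 4 Nystrom features.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, size=(100, 2)),
               rng.normal(3.0, 0.3, size=(100, 2))])
B = nystrom_features(X, n_landmarks=20, rank=4, gamma=1.0, rng=rng)
labels = lloyd_kmeans(B, k=2, rng=rng)
```

The n x n kernel matrix is never formed here: only the n x c block of sampled columns is computed, which is where the savings over exact kernel k-means would come from. The number of landmarks and the target rank are free parameters in this sketch.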

Similar resources

Randomized Clustered Nystrom for Large-Scale Kernel Machines

The Nyström method has been popular for generating the low-rank approximation of kernel matrices that arise in many machine learning problems. The approximation quality of the Nyström method depends crucially on the number of selected landmark points and the selection procedure. In this paper, we present a novel algorithm to compute the optimal Nyström low-rank approximation when the number of landm...

Scalable Kernel k-Means via Centroid Approximation

Although kernel k-means is central for clustering complex data such as images and texts by implicit feature space embedding, its practicality is limited by the quadratic computational complexity. In this paper, we present a novel technique based on scalable centroid approximation that accelerates kernel k-means down to a sub-quadratic complexity. We prove near-optimality of our algorithm, and e...

The Modified Nystrom Method: Theories, Algorithms, and Extension

Symmetric positive semidefinite (SPSD) matrix approximation is an important problem with applications in kernel methods. However, existing SPSD matrix approximation methods such as the Nyström method only have weak error bounds. In this paper we conduct in-depth studies of an SPSD matrix approximation model and establish strong relative-error bounds. We call it the prototype model for it has mo...

Beyond the Nystrom Approximation: Speeding up Spectral Clustering using Uniform Sampling and Weighted Kernel k-means

In this paper we present a framework for spectral clustering based on the following simple scheme: sample a subset of the input points, compute the clusters for the sampled subset using weighted kernel k-means (Dhillon et al. 2004) and use the resulting centers to compute a clustering for the remaining data points. For the case where the points are sampled uniformly at random without replacemen...

Scalable Kernel Clustering: Approximate Kernel k-means

Kernel-based clustering algorithms have the ability to capture the non-linear structure in real world data. Among various kernel-based clustering algorithms, kernel k-means has gained popularity due to its simple iterative nature and ease of implementation. However, its run-time complexity and memory footprint increase quadratically in terms of the size of the data set, and hence, large data s...

Journal:
  • CoRR

Volume: abs/1706.02803

Publication date: 2017